A Tunable Machine Vision-based Strategy for Automated Annotation of Chemical Databases
Internal identifier: 000818 (Main/Exploration); previous: 000817; next: 000819
Authors: Jungkap Park; Gus R. Rosania; Kazuhiro Saitou
Source:
- Journal of Chemical Information and Modeling [1549-9596]; 2009.
Abstract
We present a tunable, machine vision-based strategy for automated annotation of virtual small molecule databases. The strategy combines three components: a machine vision-based tool that extracts structure diagrams from research articles and converts them into connection tables; a virtual “Chemical Expert” system that screens the converted structures according to adjustable levels of estimated conversion accuracy; and a fragment-based measure of intermolecular similarity. For annotation, links are established from the calculated chemical similarity between the converted structures and entries in a virtual small molecule database. Overall annotation performance can be tuned by adjusting the cutoff threshold on the estimated conversion accuracy. We performed an annotation test that attempts to link 121 journal articles registered in PubMed to entries in PubChem, the largest publicly accessible chemical database. Two test cases were run and their results compared to see how overall annotation performance is affected by different threshold levels of the estimated accuracy of the converted structures. Our work demonstrates that over 45% of the articles could have true positive links to entries in the PubChem database, with promising recall and precision rates in both tests. Furthermore, we illustrate that the Chemical Expert system, which screens converted structures according to adjustable levels of estimated conversion accuracy, is a key factor in overall annotation performance. We propose that this machine vision-based strategy can be combined with text-mining approaches to facilitate extraction of contextual scientific knowledge about a chemical structure from the scientific literature.
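The screening-then-linking idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the fragment-based measure, so the Tanimoto coefficient over fragment sets is used here as a stand-in, and the record type `ConvertedStructure`, the function names, the fragment strings, and all cutoff values are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvertedStructure:
    """A structure converted from an article image (hypothetical record)."""
    article_id: str
    fragments: frozenset   # hypothetical fragment keys describing the structure
    est_accuracy: float    # estimated conversion accuracy, assumed in [0, 1]


def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient of two fragment sets (stand-in measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def annotate(structures, database, accuracy_cutoff, similarity_cutoff=1.0):
    """Return (article_id, entry_id) links.

    Structures whose estimated conversion accuracy falls below
    accuracy_cutoff are screened out before any similarity search.
    """
    links = []
    for s in structures:
        if s.est_accuracy < accuracy_cutoff:
            continue  # rejected by the screening step
        for entry_id, entry_fragments in database.items():
            if tanimoto(s.fragments, entry_fragments) >= similarity_cutoff:
                links.append((s.article_id, entry_id))
    return links


# Toy data: two database entries and two converted structures (all made up).
database = {
    "entry-1": frozenset({"c1ccccc1", "C(=O)O"}),
    "entry-2": frozenset({"c1ccccc1", "N"}),
}
structures = [
    ConvertedStructure("article-A", frozenset({"c1ccccc1", "C(=O)O"}), 0.95),
    ConvertedStructure("article-B", frozenset({"c1ccccc1", "N"}), 0.40),
]

strict = annotate(structures, database, accuracy_cutoff=0.9)
lenient = annotate(structures, database, accuracy_cutoff=0.3)
print(strict)
print(lenient)
```

In this toy setup, raising the accuracy cutoff drops the low-confidence structure and its link, while lowering it admits more links. That is the kind of trade-off the adjustable threshold is meant to control.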
Url:
DOI: 10.1021/ci900029v
PubMed: 19621901
PubMed Central: 2907084
Affiliations:
Links to previous steps (curation, corpus...)
- to stream Pmc, to step Corpus: 000089
- to stream Pmc, to step Curation: 000089
- to stream Pmc, to step Checkpoint: 000164
- to stream Ncbi, to step Merge: 000073
- to stream Ncbi, to step Curation: 000073
- to stream Ncbi, to step Checkpoint: 000073
- to stream Main, to step Merge: 000826
- to stream Main, to step Curation: 000818
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">A Tunable Machine Vision-based Strategy for Automated Annotation of Chemical Databases</title>
<author><name sortKey="Park, Jungkap" sort="Park, Jungkap" uniqKey="Park J" first="Jungkap" last="Park">Jungkap Park</name>
<affiliation><nlm:aff id="A1"> Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 48109,<email>jungkap@umich.edu</email>
,<email>grosania@umich.edu</email>
</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Rosania, Gus R" sort="Rosania, Gus R" uniqKey="Rosania G" first="Gus R." last="Rosania">Gus R. Rosania</name>
<affiliation><nlm:aff id="A2"> Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, Michigan 48109,<email>kazu@umich.edu</email>
</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Saitou, Kazuhiro" sort="Saitou, Kazuhiro" uniqKey="Saitou K" first="Kazuhiro" last="Saitou">Kazuhiro Saitou</name>
<affiliation><nlm:aff id="A1"> Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 48109,<email>jungkap@umich.edu</email>
,<email>grosania@umich.edu</email>
</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PMC</idno>
<idno type="pmid">19621901</idno>
<idno type="pmc">2907084</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2907084</idno>
<idno type="RBID">PMC:2907084</idno>
<idno type="doi">10.1021/ci900029v</idno>
<date when="2009">2009</date>
<idno type="wicri:Area/Pmc/Corpus">000089</idno>
<idno type="wicri:Area/Pmc/Curation">000089</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000164</idno>
<idno type="wicri:Area/Ncbi/Merge">000073</idno>
<idno type="wicri:Area/Ncbi/Curation">000073</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000073</idno>
<idno type="wicri:doubleKey">1549-9596:2009:Park J:a:tunable:machine</idno>
<idno type="wicri:Area/Main/Merge">000826</idno>
<idno type="wicri:Area/Main/Curation">000818</idno>
<idno type="wicri:Area/Main/Exploration">000818</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a" type="main">A Tunable Machine Vision-based Strategy for Automated Annotation of Chemical Databases</title>
<author><name sortKey="Park, Jungkap" sort="Park, Jungkap" uniqKey="Park J" first="Jungkap" last="Park">Jungkap Park</name>
<affiliation><nlm:aff id="A1"> Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 48109,<email>jungkap@umich.edu</email>
,<email>grosania@umich.edu</email>
</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Rosania, Gus R" sort="Rosania, Gus R" uniqKey="Rosania G" first="Gus R." last="Rosania">Gus R. Rosania</name>
<affiliation><nlm:aff id="A2"> Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, Michigan 48109,<email>kazu@umich.edu</email>
</nlm:aff>
</affiliation>
</author>
<author><name sortKey="Saitou, Kazuhiro" sort="Saitou, Kazuhiro" uniqKey="Saitou K" first="Kazuhiro" last="Saitou">Kazuhiro Saitou</name>
<affiliation><nlm:aff id="A1"> Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 48109,<email>jungkap@umich.edu</email>
,<email>grosania@umich.edu</email>
</nlm:aff>
</affiliation>
</author>
</analytic>
<series><title level="j">Journal of chemical information and modeling</title>
<idno type="ISSN">1549-9596</idno>
<idno type="eISSN">1549-960X</idno>
<imprint><date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en"><p id="P1">We present a tunable, machine vision-based strategy for automated annotation of virtual small molecule databases. The strategy combines three components: a machine vision-based tool that extracts structure diagrams from research articles and converts them into connection tables; a virtual “Chemical Expert” system that screens the converted structures according to adjustable levels of estimated conversion accuracy; and a fragment-based measure of intermolecular similarity. For annotation, links are established from the calculated chemical similarity between the converted structures and entries in a virtual small molecule database. Overall annotation performance can be tuned by adjusting the cutoff threshold on the estimated conversion accuracy. We performed an annotation test that attempts to link 121 journal articles registered in PubMed to entries in PubChem, the largest publicly accessible chemical database. Two test cases were run and their results compared to see how overall annotation performance is affected by different threshold levels of the estimated accuracy of the converted structures. Our work demonstrates that over 45% of the articles could have true positive links to entries in the PubChem database, with promising recall and precision rates in both tests. Furthermore, we illustrate that the Chemical Expert system, which screens converted structures according to adjustable levels of estimated conversion accuracy, is a key factor in overall annotation performance. We propose that this machine vision-based strategy can be combined with text-mining approaches to facilitate extraction of contextual scientific knowledge about a chemical structure from the scientific literature.</p>
</div>
</front>
</TEI>
<affiliations><list></list>
<tree><noCountry><name sortKey="Park, Jungkap" sort="Park, Jungkap" uniqKey="Park J" first="Jungkap" last="Park">Jungkap Park</name>
<name sortKey="Rosania, Gus R" sort="Rosania, Gus R" uniqKey="Rosania G" first="Gus R." last="Rosania">Gus R. Rosania</name>
<name sortKey="Saitou, Kazuhiro" sort="Saitou, Kazuhiro" uniqKey="Saitou K" first="Kazuhiro" last="Saitou">Kazuhiro Saitou</name>
</noCountry>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000818 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000818 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= PMC:2907084 |texte= A Tunable Machine Vision-based Strategy for Automated Annotation of Chemical Databases }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i -Sk "pubmed:19621901" \
  | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd \
  | NlmPubMed2Wicri -a OcrV1
This area was generated with Dilib version V0.6.32.